feat: Add counter metrics for consumed cycles #9922
Merged
michael-weigelt merged 1 commit into dfinity:master on May 5, 2026
Conversation
dsarlis
commented
Apr 17, 2026
Contributor
Author
Open points:
- Naming is something that I deliberately did not spend too much time on. I can totally understand if the suffix `_as_counters` is not considered good enough, and I'm open to ideas for alternative names.
- In the tests you'll notice that I've added a few places where I check the new counters in addition to the old ones. One might wonder: "what if we try to add such checks universally in `ExecutionTest`?" I tried it and, as it turns out, it's not easy to make all tests work, as some of them cut corners and you're not always able to do the check at "clear points", i.e. points where you know that there can be no outstanding callbacks (`execute_all` would be the best candidate and gets the majority of tests to work, but you still miss some special ones, highly concentrated in `hypervisor_tests`). I'd leave this as an improvement if anyone wants to pick it up.
Contributor
pierugo-dfinity
approved these changes
Apr 20, 2026
Approving changes in rs/monitoring/metrics
michael-weigelt
approved these changes
May 4, 2026
Add counter versions of the consumed-cycles metrics that are stored in `ReplicatedState`. The existing ones behave like gauges (so their values can go down when prepayments are made and up when refunds are issued), which makes it more challenging for consumers to build automated monitoring tools that aggregate over them. Having them increase monotonically makes it easier to calculate rates of change, show aggregates over time, etc.

The key idea is to introduce a second map of `<CyclesUseCase, NominalCycles>` in the `ReplicatedState` that will only be updated once per use case: either at the payment stage if we know the precise amount, or only at the refund stage if a prepayment is made with a refund expected later. The second map is quite similar to the existing one in all other respects (how it is stored in checkpoints and how it is exposed as Prometheus metrics); only the way the values are updated differs.

A new map is introduced to ease the transition, as migrating from the old map to the new one is non-trivial: a proper cutoff point needs to be introduced to handle outstanding callbacks that might have been created before the metric was introduced. This is left for a follow-up if and when people decide to do it. The new map will be used in a follow-up that will implement the new management canister endpoint to retrieve canister-level metrics.
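
For readers skimming the description, the update rule for the second map can be sketched roughly as follows. This is a minimal illustration and not the code in this PR. The struct `ConsumedCycles`, its field and method names, the `Instructions` variant, and the use of a plain `u128` for `NominalCycles` are all assumptions made for the example. The direction in which the gauge-like value moves at each stage is likewise an assumption of the sketch. The point is only that the gauge-like value is adjusted at both stages, while the counter is touched once.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-ins for the real types; names are illustrative only.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum CyclesUseCase {
    Instructions,
    HttpsOutcalls,
}

type NominalCycles = u128;

#[derive(Default)]
struct ConsumedCycles {
    // Existing, gauge-like map: adjusted at prepayment and again at refund.
    gauge_like: BTreeMap<CyclesUseCase, NominalCycles>,
    // New map: each charge is recorded exactly once, so values never decrease.
    as_counters: BTreeMap<CyclesUseCase, NominalCycles>,
}

impl ConsumedCycles {
    /// Precise amount known up front: both maps are updated at payment time.
    fn charge_exact(&mut self, use_case: CyclesUseCase, amount: NominalCycles) {
        *self.gauge_like.entry(use_case).or_default() += amount;
        *self.as_counters.entry(use_case).or_default() += amount;
    }

    /// Prepayment with an expected refund: only the gauge-like map moves.
    fn prepay(&mut self, use_case: CyclesUseCase, prepaid: NominalCycles) {
        *self.gauge_like.entry(use_case).or_default() += prepaid;
    }

    /// Refund stage: the gauge-like map moves back by the refunded amount,
    /// and the counter records the net amount (prepaid minus refunded) once.
    fn refund(
        &mut self,
        use_case: CyclesUseCase,
        prepaid: NominalCycles,
        refunded: NominalCycles,
    ) {
        assert!(refunded <= prepaid, "refund cannot exceed the prepayment");
        let gauge = self.gauge_like.entry(use_case).or_default();
        *gauge = gauge.saturating_sub(refunded);
        *self.as_counters.entry(use_case).or_default() += prepaid - refunded;
    }

    fn gauge(&self, use_case: CyclesUseCase) -> NominalCycles {
        self.gauge_like.get(&use_case).copied().unwrap_or(0)
    }

    fn counter(&self, use_case: CyclesUseCase) -> NominalCycles {
        self.as_counters.get(&use_case).copied().unwrap_or(0)
    }
}
```

In this sketch, a prepayment of 100 followed by a refund of 30 moves the gauge-like value to 100 and then back to 70, while the counter stays at 0 until the refund stage and then moves straight to 70. A scrape taken between the two stages therefore never sees a counter value that later has to be retracted.
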
Additionally, the new metrics include the use case `HttpsOutcalls` in the canister-level metrics, as it's useful to determine how much each canister uses this feature. I've opted not to change the existing metrics to do the same, as it would make things less clean, imo, than the current approach: a single specific API is used to perform this update in exactly one place where it's needed.

The changes in the PR are mostly driven by the addition of the new map of metrics, updates to the protobuf files to store the new metrics, the changes needed to additionally include the `HttpsOutcalls` use case, and some changes in tests to support the new metrics.